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■ Abstract We provide rigorous and exact results characterizing the statistics of spike trains in a network of leaky 

(N : 

T-H , Integrate-and-Fire neurons, where time is discrete and where neurons are submitted to noise, without restriction 
on the synaptic weights. We show the existence and uniqueness of an invariant measure of Gibbs type and discuss 
its properties. We also discuss Markovian approximations and relate them to the approaches currently used in 
^ computational neuroscience to analyse experimental spike trains statistics. 
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1 Introduction 

The neuronal activity is manifested by the emission of action potentials or spikes. While the shape of an action 
potential is essentially constant for a given neuron, the succession of spikes {spike train) that a neuron is able 
to emit, depending on its state and in response to excitations coming from other neurons or external stimuli, is 
simply overwhelming. About twenty different spike trains forms are classified in the literature |24) . It is widely 
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believed by the neuroscience community tiiat spike trains emitted by a neuron assembly constitute somehow a 
"code" and deciphering this code is a big challenge [42| . 

Spike train are usually not exactly reproducible when repeating the same experimeniy, even with a very 
good control ensuring that the experimental conditions have not changed. Therefore, researchers are seeking 
statistical regularities in spike trains. For this, they define statistical indicators such as firing rate, probability 
of spike coincidence, spike response function, spike correlations (see 14211171121] for a comprehensive introduction 
to spike train analysis). An early step for "reading the code" is therefore to provide an accurate model for spike 
train statistics, i.e. a probability distribution "fitting at best" the experimental data and/or matching what 
neuroscientists believe relevant in neurons communication via spikes. 

For example, it has been long believed that firing rates (the probability that a neuron emits a spike in a certain 
time interval) were carrying most of the "information" exchanged by neurons. As a consequence the canonical 
statistical model, namely the probability distribution which reproduces the firing rates without additional as- 
sumptions, is a Bernoulli distribution (possibly with time dependent probabilities), and the probability that a 
given number of spikes is emitted within a definite time interval is Poisson. Actually, there are many mechanisms 
in the nervous system, such as the muscle commands 1 , working essentially with rates. But more recent ex- 
periments evidenced the role of spikes timing, spike ordering, spike synchronization, in processes such as vision 
|53II54II48| or interactions between perception and motion [22II41II23] . Here, one has to consider more elaborated 
statistical models, such as the (weak) pairwise interactions model proposed by Schneidman and collaborators [IS] 
in experiments on the salamander retina. Consequently, there is an intensive activity and wide debates, focusing 
on the determination of statistical models of spike train statistics, with a clear evidence: distinct statistical models 
lead to fundamentally distinct characterizations of the mechaiusms at work in the piece of nervous system under 
study [37]. 



^ Although, retinal responses to a natural image, seem to be almost reproducible spike by spike |39| 
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Clearly, obtaining statistical models from experimental data or selecting a model among many others are 
difficult task. Forgetting about all the experimental difficulties to obtain "clean" data with a good control on 
parameters experiments, one has still to solve delicate questions such as the control of finite sampling effects 
(finite duration, finite numbers of experiments) , extrapolation of the probability distribution characterizing small 
neural assemblies to a large population of neurons [43) . non stationarity, effects of synaptic plasticity or adaptation 
mechanisms [58^. As a consequence, there is no general recipe to extract a statistical model from data and several 
approaches have been proposed |49ll40ll36] . 

It appears simpler to characterize spike trains statistics in neural networks models where one controls exactly 
the neural network parameters, the number of involved neurons, the number of samples, and the duration of 
the experiment (with a possible mathematical extrapolation to infinite time). Especially, producing analytical 
(and when possible, rigorous) results on those statistics provides clues toward resolving the delicate experimental 
questions raised above, with possible outcomes toward new algorithms for data treatments [59]. Obviously, for 
this, one needs models which are a good compromise between analytical tractability and biological realism. 

Generalized Integrate-and-Fire (gIF) models [H] constitute a good example of this. Besides the fact that 
they capture the conductance-based mechanisms for spike generation, without focusing to much on the biological 
machinery, it has been shown by authors like Jolivet et al. [271128] that they are "good enough" to reproduce 
spike trains from real neurons. Moreover, these models allow the analytical characterization of their dynamics 
[11) . Further simplifications of gIF models lead to the Leaky Integrate-and-Fire (LIF) model, which was in fact 
the first proposed to model neuron dynamics (in 1907!) ,33 . In this setting, prominent mathematical results on 
spike statistics in the presence of noise, have been published. For example, Brunei and Hakim obtained a complete 
characterization of LIF models with noise and strong dilution of synaptic weights, using a mean-field approxima- 
tion and assuming that the synaptic weights are inhibitory [S]. Also, Touboul and Faugeras [56| obtained rigorous 
results on the probability distribution of inter-spike intervals for one LIF neuron submitted to noise. They recently 
extended their results in [57| to networks of IF neurons of several type considering different types of interactions 
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and conclude that the spikes times can be modelled as a Markov chain. 

In this paper, we proceed along similar lines, although using difTerent methods and raising different conclusions, 
and make a complete characterization of spike train statistics for the discrete-time leaky Integrate-and-Fire model 
with noise and time-independent stimuli. This is somehow a continuation of the paper proposing a complete 
classification of dynamics in this model, without noise, and of [11] extending these results to gIF models. It also 
rigorously supports the main assumption made in [5] where it was proposed to characterize spike train statistics 
in neural networks by Gibbs distributions, with an emphasis on synaptic plasticity effects. Here, we propose a 
framework allowing to handle dynamics with noise, with possible extensions to more realistic neural networks 
models such as gIF (see the conclusion section). We emphasize that we do not use any simplifying assumption 
in the model. Especially, our results hold for finite-sized networks, without restriction on the synaptic weights 
(except that they are finite) and all type of synaptic graph structures are allowed. Also, we are not constrained 
by an ad hoc choice of the initial conditions distribution of membrane potentials; instead we propose a procedure 
where this distribution is selected by dynamics and is uniquely determined, as we show. 

Moreover, this work attempts to bridge a gap between the mathematical characterization of spike train statis- 
tics and empirical methods or algorithms currently used by the neuroscience community. As a consequence, this 
paper addresses to 2 distinct communities. On one hand, to specialists from mathematical statistical physics and 
ergodic theory, as far as the mathematics of this paper are concerned. From this point of view the results exposed 
here are direct applications of classical results in ergodic theory. But, to the best of our knowledge, it is the first 
time that they are used in this context. On the other hand, this paper addresses to neuroscientists. Studying a 
fairly simple model, from the biological point of view, we nevertheless obtain conclusions which could be useful 
for the characterization of spike trains in real experiments, with concrete applications toward implementation of 
software for spike train analysis. 
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The paper is organized as follows. In section [2] we define the model and infer some preliminary results. 
Especially we compute explicitly the probability that neurons fire at time t given the past. This defines a transition 
probability which is the main object of our study. A salient result obtained in section [2] is the fact that this 
transition probability is non Markovian, it depends on an unbounded past. This defines a stochastic process, 
known under the name of "chain with complete connections" ([35] and references therein), which is studied in 
section[31 Especially, we show that there is a unique invariant probability measure (equilibrium state) whatever the 
model-parameters values, which satisfies a variational principle and is a Gibbs distribution. We also show that the 
entropy of the discrete-time leaky Integrate-and-Fire model with noise is always positive. In sectionUwe propose 
a Markovian approximation where memory depends on R time steps. This approximation allows the computation 
of the main quantities used in neuroscience for the characterization of raster plots statistics. The computation 
of these quantities is done in section [S] We also show that, in this approximation, the equilibrium state can also 
be obtained via the Jaynes principle of statistical physics (maximizing the entropy under constraints), and we 
discuss in which sense the statistical models used in neuroscience community are approximations whose degree of 
accuracy can be controlled. 



2 Definitions and preliminary results 

2.1 Model definition. 
2.1.1 The neural network. 

Fix > a positive integer called "the dimension of the neural network" (the number of neurons). Let W be a 
N X N matrix, called "the matrix of synaptic weights" , with entries Wij . It defines an oriented and signed graph, 
called "the neural network associated to W" , with vertices i — 1 . . . N called the "neurons ^ " . There is an oriented 



Therefore, "neurons" are points here, i.e. they have no structure. 
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edge j ^ i whenever Wij 7^ 0. Wij is called "the synaptic weighl[_| from neuron j to neuron i" . The synaptic 



weight is called "excitatory" if Wij > and "inhibitory" if Wij < 0. We assume that the synaptic weights are 
bounded i.e. Wij £ [Wmin,Wmax]yi, j where —00 < W„iin < Wmax < +00. Moreover, in this paper, the Wij's 



do not evolve in time. 



2.1.2 Membrane potential. 

Each vertex (neuron) i is characterized by a real variable Vi called the "membrane potential of neuron i". Fix 
a positive real number > called the "firing threshold". Let Z be the function Z{x) = ^ ^) where x is 
the indicatrix function. Namely, Z{x) = 1 whenever x > and Z{x) — otherwise. Z{Vi) is called the "firing 
state of neuron i". When Z{Vi) = 1 one says that neuron i "fires" or "spikes" and when Z{Vi) — neuron i is 
"quiescent". We extend the definition of Z to vectors: Z{V) is the vector with components Z{Vi), i = 1 . . . N . 

2.1.3 Dynamics. 

Fix 7 G [0, 1[, called the "leak rate". The discrete time and synchronous dynamics of our model is given by: 

V{t + l) = F{V{t)) + aBB{t), (1) 

where ag > 0, V = {Vi)^=i is the vector of membrane potentials and F — (Fi)^-^ with: 

N 

F,{V) = -yV, (1 - Z[Vi\) + ^ W,jZ[Vj] +h; i = l...N. 

We assume that initial conditions belong to some compact set in (i.e. the initial membrane potentials are 
bounded). The variable li is called "an external input applied to neuron i". We assume in this paper that it does 
not depend on time. 

^ On biological grounds, tfiis corresponds to the maximal amplitude of the post-synaptic potential generated, at the 
dendrite connecting the pre-synaptic neuron j to the post-synaptic neuron i, when neuron j emits an action potential. 
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2.1.4 Noise. 



The vector B{t) — {Bi{t))^_^ is an additive nois43- It has Gaussian identically distributed and independent entries 
Bi{t) with zero mean and variance 1. We note N{Q, 1) the standard Gaussian law and : 



7r(a::) = 



+00 2 



e 2 du. 



The parameter gb in {!]) tunes the noise amplitude. 



2.1.5 Interpretation. 

To the best of our knowledge this model has been first introduced by G. Beslon, O. Mazet and H. Soula [55] ■ It 
belongs to the family of the so-called leaky-Integrate-and-Fire models [21]. Its interpretation is the following. A 
neuron "fires" i.e. emits an action potential (or "spike") whenever its membrane potential exceeds the threshold 
9. Here a spike is modelled by the function Z. For an isolated neuron, firing corresponds, in the model, to the reset 
of the membrane potential to a rest value Vrest = 0. In a network, each neuron i receives spikes from pre-synaptic 
neurons. When a pre-synaptic neuron j emits a spike this modifies the membrane potential of neuron i by an 
amount Wij. Thus, according to eq. U]), when a neuron fires, it immediately receives inputs from other neurons 
and from the environment (the constant input and the noise) (hence its value at the next time step is different 
from zero in general). If a neuron does not fire and does not receive infiuences from other neurons or input, then 
its membrane potential decays exponentially fast with a decay rate < 7 < 1. The discussion of the biological 
relevance of this model and its extensions towards more elaborated models with adaptive conductances has been 
done in 

* On phenomenological grounds, it mimics effects such as noise in synaptic transmission (neurotransmitters diffusion), 
randomness in ionic channels transitions, or effects of hidden degrees of freedom. 
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2.2 Technical definitions. 

2.2.1 Spiking sequences. 

Call A4 — the phase space of our dynamical system. Given two integers s < t (possibly negative) we 
note Vg the piece of trajectory V{s), . . . , V{t). To each membrane potential value, Vi{t), we associate a variable 
uji{t) = Z{Vi{t)). The "spiking pattern" of the neural network at time t is the vector uj{t) — {i^iit))^^: it tells us 
which neurons are firing at time t, (ujiit) = 1) and which neurons are not firing at time t {ijJi{t) = 0). We denote 
by cj* the sequence or spike block tj(s) . . .u!(t). Associated with each piece of trajectory Vs there is a unique 
spike block ujI with uji{n) = Z(Vi{n)), i = l...N,s<n<t. We note Z{Vs) = ui^. Also we note cj^^ajj^ = uil the 
concatenation of the blocks o;*^ and uil^ . 

2.2.2 Raster plots. 

Call A the set of spiking patterns (alphabet). An element of , i.e. a bi-infinite sequence uj — {'^(^)}ttf^oo '^^ 
spiking patterns, is called a "raster plot". It tells us which neurons are firing at each time t £ Z. In experiments 
raster plots are obviously finite sequences of spiking pattern but the extension to especially the possibility 
of considering an arbitrary distant past (negative times) is quite useful in the present work. The set is a 
topological space for the product topology [35]. The open sets are the cylinder sets, namely the sets [ljI] = 
|a;' G A^, Ld' (n) = ui{n), n = s, . . . ,t^ . Cylinder sets are also a countable basis for the cr-algebra in A^ . There 
is a natural distance on A"^, 




if uj = Lu' , 



u) and ijj' differ for the first time in the n-th spiking pattern; 



(2) 



for some < 9 < 1. A classical choice is = ^. Here, it can be convenient to take — 'y. 
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2.2.3 Last firing time. 



For {s,t) £ Z^, s < t, and each i = 1 . . . A'^, we define the "last firing time of neuron i in the sequence ujI" by: 



t. dof 



s, if uji{k) — 0, k = s,...,t; 



(3) 



max{s < k < t,uji{k) — 1} , 



if 3k£{s,...,t} such that uji{k) = 1. 



Therefore, Ti{uJs) — s either if neuron i fires at time s or if it does not fire during the whole time interval [s,t]. 
In this way, the name "last firing time" is a little bit confusing, but this has no incidence on the mathematical 
developments. 

2.3 The asymptotic probability distribution of membrane potentials and raster plots. 
2.3.1 Conditional probability distribution ofV{t+ 1). 

Call P — A/'(0, 1)®^^^ the joint distribution of the noise trajectories. Under P the membrane potential y is a 
stochastic process whose evolution is given eq. ([T]). Fix a pair of integers (s, t), s < t. The probability distribution of 
V{t+ 1) can be explicitly obtained with the following remark. Since the cylinder sets [oj*] constitute a (countable) 
basis for the a-algebra in and since to each piece of trajectory is associated a unique sequence oj*, we 
consider first the probability distribution of V{t + 1) conditioned by Z{Vs ) = ojI and by the initial condition V{s), 
assumed here to be bounded. Then, the following holds as easily checked with a few algebra: 

Proposition 1 For each {s,t) £ Z^, s < t, conditionally to Z{Vs) = ujI, and given V(s), 




Vi{s) + Ci{ojg) + aB^i{ojs), if neuron i didn't fire in the time interval [s,tj; 



otherwise. 



(4) 



where 




(5) 



t 




(6) 
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U^l)^ J2 i^'bs). (7) 



Remark. 



This equation expresses that the neuron loses its memory whenever it fires. This is due to the fact that we 
reset the membrane potential, after firing. This consequently simplifies the following analysis. For a discussion 
on dynamics of spiking neural phase models when the condition is relaxed see [31| . 

Clearly, the membrane potential is the sum of a "deterministic" part, 7*^"^~'*Vi(s) + Ci{oj\), fixed by initial 
condition at time s and by the spike sequence oj\, and a stochastic part, (TB£,i{'^\), where the probability 
distribution of the noise is also fixed by the spike sequence a;*. More precisely, since the B^s are 

independent, Gaussian with mean zero and variance 1, the ^i(aj*)'s, i — I...N are, under P, Gaussian, 

. , , . , , . l_T,2(t+l-Ti(^*)) 

independent, with zero mean and variance '■ — j^^p . 

Denote by E [] the expectation under P. It follows that: 



Proposition 2 For each {s,t) G Z^,s < t, conditionally to Z(Vs) = uil, and gtven V(s), V{t+ 1) is Gaussian 
with mean: 



E[v,{t+l)\ulV{s)j = 



^t+l ^y^(^s) + Ci{u]l), if neuron i didn't fire in the time interval [s,t]\ 
Ci{u)\), otherwise. 



and covariance: 



Gov \^/,{t + l),V,{t + 1) 1 Js,V{s)\ = a2(a;*)5, 



with: 



2 t 2 l_^2(t+l-r,(^^)) 

CJi{u)s) = CTB z 2 • 

1 — 7^ 

Thus, the Vi{t -\- l)'s, i = 1 . . . N , are conditionally independent. 

Remark We used a slight abuse of notation since we condition by ojI instead of Z{Vs) ~ oj^. 
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2.3.2 The probability that some neuron i does not fire within the time interval [s,t] 
It is given by: 

t / n-1 \ 

uj*eA^-^ n=s+l \ l=s ) 

t / n-1 \ 

E n ^ < ^> 1 n {^*(^) < ^} P {ms) < 9} I cf) P (a;*) 

g_4t-s n=s+l \ l=s ) 



From prop. 0, we have 



ri-l 



l=s I 

where Ci(a;""^) and are given by ([5l),((6l),((7ll with rj;(tJs) = s. Since, in this case, is Gaussian, 

1_ 2(7i-s) 

centered, with variance — — (eq.[Sl) we have: 

n-i\ , [e-i'-'v,{s)-c,{^T^) 

1 — 1 — TT ' 



/l_-y2(„- = ) 



Since l^i(s) and the W^j's are assumed to be bounded we have, whatever n > s: 

< i7_ < P ^{V^{n) < 0}\ f] {V^il) < 9}n ljI'-^^ <n+<i, (9) 

for some constants IT-,!!-!- depending on parameters ■y,Wij, i,j = 1 . . . N, 1^, i = 1...N. Likewise, < a < 
P{{Vi(s) < 9} \ iOs) < b < 1. Without loss of generality, e.g. redefining i7_ as min(i7_,a) (redefining i7+ as 
max(J7-(-, 6)) we may write < i7_ < P ({Vi(s) < 9} | ojf ) < 11+ < 1. As a consequence. 

Proposition 3 The probability that some neuron i does not fire within the time interval [s, t] has the following 
bounds: 



< n^s' < P ( n {v^{n) < 9}] < n^_^' < 1. 

\n=s / 



As a consequence, whatever s < t, t — s finite, there is a positive probability that some neuron i does not fire within 
the time interval [s,t]. This probability vanishes exponentially fast as \t — s\ — > +oo. 
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2.3.3 Permanent regime 

The main drawback of the previous results is that we have to condition on the "initial" condition V{s) for spiking 
sequences such that some neuron does not fire between s and t. But the probability distribution of V{s) is not 
known. It has either to be "guessed" from ad hoc assumptions: is it Gaussian, uniform, "fractal" ... ? Actually, 
the determination of initial conditions distribution is, to our opinion, one of the main obstacle toward realistic 
characterizations or simulations of neural network models, intended to somehow mimics the dynamics (of some 
part) of the brain, at some stage of its evolution. Indeed, when considering the evolution of a set of neurons, 
one starts from some "initial" time s which corresponds to the beginning of the experiment. This is NOT the 
beginning of the system under study, which has undergone a previous evolution that actually determines the 
distribution of membrane potentials at time s. This distribution has little chances to be Gaussian or anything 
so mathematically "convenient", unless one finds strong arguments to justify this. Actually, as we show, such 
assumption is wrong in model ((T}. Therefore, to compute the distribution of membrane potential at time s one 
has to consider the previous evolution of the system, which only postpones the problem,... unless one assumes 
that this initial condition was drawn in an infinite past. On phenomenological grounds, "infinite past" means "a 
time quite longer than all characteristic time scales in the system" , though, mathematically, one may take it truly 
infinite. This is what we do here, focusing on what we call a "permanent regime" (by analogy with Physics) where 
the initial condition is fixed in the infinite past, namely s — )■ — oo. In this case, indeed, 7*+^"''Vi(s) ^ 0. As we 
show, this procedure selects a unique probability distribution for membrane potentials, with a highly non trivial 
structure (see eg. I33|) . 

We therefore consider left-infinite sequences with corresponding last firing time: 



t 



dof 
— < 



OO, if uji(k) = 0, Vfc < t; 



Ti{uj'. 



-oo 



) 



(10) 



max{— oo < k < t,uii{k) — 1} otherwise. 



We now show that proposition [2] extends as well to the case s — > — oo, namely: 



13 



Proposition 4 For each t £ Wi, conditionally to uj_^, V{t + 1) is Gaussian with mean: 

r 1 ^ 1 - -yt + ^-'^^i'^-oa) 

E \v,it + 1) 1 = C,{o^'-oo) = Yl W,,x,j{J_^) + I, 1^ , (11) 

where, 

t 

Xij{ujtoa)^ ^ 7*"'tjj(Z), (12) 

and covariance: 

Gov \v,{t + l),Vj{t + 1) I ojLoo] = -rf (a;ioo)5zj = (13) 
Thus, the Viit -\- l)'s, i — 1 . . . N , are conditionally independent. 

Proof Let us first show that the quantities defined by eq. ([SI),®,!!?! are well defined in the limit s — !> — cx). Consider 
first the limit of Xij{ujs) given by eq. There are two possibilities. Either oj^^o is such that Ti{Lu^^) = 
n > — oo. Then, Xij{u}'Lao) is a finite sum X]i=n 7* ''^j(0 and is well defined. Or, Ti{uj*L^) — — oo. Then 
Xij{uj*Lrx,) = X]*=-oo W ~ IZt^ 'y^'^ji^ ^ 0- This series converges since 7 < 1 and u}j{t — I) — 0,1. 
Moreover, S,i{u}l) = X^*_7..(^t') 7*~'Bi(0 is a sum (possibly infinite) of independent Gaussian centered variables 
with finite variance. As a consequence, 

t 

U^-oo)= Yl -y'^'BM, (14) 

is Gaussian centered with variance — - — izryi ■ Finally, from Q Vi{t + 1) = Ci(cj_oo) + cs^j (1^-00 )> and 

the proposition follows from the independence of the ^i{oj^_^)'s and their Gaussian distribution. □ 

2.3.4 Elementary bounds. 



We have: 



< a;y (aj*_oo) < C^^) 
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and, 



dof 



1 ^ 1 

h + — V w,, < c,{J^^) < - — 

1 — 7 1 — 



j=i 



dcf „+ 



In the same way 



2 

1 ^ 1, t X ^ cr_B 
1 — 7^ 



(16) 



(17) 



^.5.5 The transition probability. 

We now compute the probability of a spiking pattern at time t + 1, uj{t + 1), given the past uj-ao- It is given by 
the following: 

Proposition 5 The probability of (jj(t + 1) conditionally to uj*L^ is given by: 



N 

p(u;{t+l)\Lotoo,) =Ylp(LO,{t+l)\o 



i=l 



with 



(18) 



' {..,{t + ^.,{t + l)n + (1 - ..(t + D) (l - n ( ' ) . (19) 



Proof We have, using the conditional independence of the Vi{t + l)'s: 

N 



P (u;{t + 1) 1 a;ioo) = J] + 1) ^ (^«(* + 1) > ^ 1 ^-oo) + {1 - u;,{t + 1)) P + 1) < ^ [ cji^o)] ■ 

Since the Vi(t + l)'s are Gaussian, with mean Ci{uj*_^) with a variance o-^(tiJ^oo) directly obtain (|18p . (|19p . 

n 

Consequently, it is possible, knowing the past sequence oj^^, to determine the probability of the spiking 
pattern uj{t +1). In this way, P (uj{t + l)\oj^^, ) acts as a transition probabihty, as in Markov chains. But here, 
the length of the Markov chain depends on the last firing time of each neuron, since in fact. 



' (uj,{t + l)|i^ioo) = P [^t{t + l)|a;*^(„t_^)) 
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The problem here is that, according to prop. |3l we cannot bound Ti{ijj^^). Although this time Ti(uj^_^) is 
almost-surely finite, nevertheless, whatever T > 0, there is a positive probability set of sequences uj such that 
Ti{uj*_r^) < t — T. So we have to consider a process where transition probability may have an unbounded memory. 
This type of process is called "variable length Markov chain" [5^ . Such processes can be studied in the general 
context of chains with complete connections and g -measures, developed in section [S] 

2.3.6 Stationarity. 

In the present setting where 1^ does not depend on t we have the following property: 
Proposition 6 Fix a sequence a^rx>, a{—n) £ A, n> Q. Then, \/t G 

P (ojit) = a(0) i J-J^ = aZl) = P ('^(O) = a(0) | ^Z^ = • (20) 

Proof Assume that uj{t — n) — a{~n),n > 0, as in the l.h.s of (|20|l . Then, according to eq. (|10|l . Ti{ij^sJ^) = 

t + Ti(aZj^)- Therefore, according to eq. (|12p . 

t-l t-l -1 

and, according to pip . 

Ci{u}*:r^) = Y + f— ' = Y '^^3^^j('^-lo) + h y— = Q(alL)- 

Note that this last property holds because 1^ does not depend on time. We have also, from the same arguments. 

Consequently, 

N 

p(u^{t) = a{0)\J-^=aZl)j=\{ 

1=1 



= P (a;(0) = a(0) 1 = a_^) . 

□ 



+ (l-a;,(t)) l-TT 



cr,j(LJ 



t-l^ 



n 



a,(0)7r 



+ (l-a,(0)) l-TT 
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Therefore, instead of considering a family of transition probabilities depending on t, it suffices to define the 
transition probability at one time t £ Z, for example t = 0. 

3 The equilibrium state. 

In this section we show the existence of a unique invariant probability distribution for the dynamics ([l]) and 
characterize it explicitly. Especially, we show that it is an equilibrium state and a Gibbs state in the sense of 
the thermodynamic formalism in ergodic theory [30lll3j . For this we use the concept of measures 29 , coming 
from ergodic theory, and very close (equivalent in the present setting) to the concept of chains with complete 
connectionjf], which comes from probability theory. However, all known theorems used here being formulated 
in the g-measure context, we use this formulation here. For the convenience of the reader we however provide 
examples and illustrations of the used notions. Our main reference are [12ll34lfT4ll 5ll35). 

We proceed in several steps leading us to the theorems 1 1121 the main results of this paper. 

3.1 Definitions and elementary results 
3.1.1 Setting and notations. 

The main object under study here is the family of transitions probabilities (|18l) . Using the stationarity proposition 

[6] we can restrict to transition probabilities of the form P{lj{0)\ijjZ^). Namely, we may focus on sequences in 

A'^oo- From now on we set uj = i^-ooi yi = '^-oo sequence lu_ is called an "history"), X = A^cot ~ -^-oo- 

The a-algebra (the set of cylinders) on X (resp. X) is denoted T (resp. £). 

^ The concept of chains with complete connections, dates back to 1935 I38| . They are a generalization of Markov chains 
with an infinite memory. More precisely, they are induced by conditional probabilities of the form: P(a;(t + 1) | oj^^^). These 
transition probabilities appear to be a extension of the notion of fc-step Markov chain with an infinite fc. These objects must 
be taken with some precautions because, in the non-Markovian case, the conditioning is always on an event of probability 
zero (see |35| for proper definition). The transition probabilities given by eq. II18II . II19I I define a system of such transition 
probabilities. 
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Call T the right shift over X i.e. {Tu}){t) — ui{t — V), t < 0. The use of the right shift, instead of the left shift 
currently used in dynamical systems theory, is related to the formulation of the problem in terms of transitions 
probabilities (see for example We note Loa, the right concatenation of uj and a £ A, namely, this is the 
sequence uj' such that uj' {t — 1) = ^{t), t < and aj'(O) — a. Note that T{uja) = uj. 

3.1.2 g-functions. 

Definition 1 A g-function over {X, T) is a measurable function g : X ^ [0,1] which satisfies, for all uj £ X: 

E 9{^') = 1- (21) 

ui' .,T{lj')=lj 

Examples of g functions are precisely transition probabilities of type (|18|l . Indeed, for ui £ X, set: 



N 



go 



(a;) = PKO)|a;) = n 



+ (l-a;,(0)) 1 



(22) 



Then, by definition, all uj''s in the sum (|2ip have the form u}' — uja, and: 

uj' ,T{uj')=uj a€A 

We now give two properties of go used below. 
3.1.3 go IS non-null 

A g function is non null on X if for all u £ X, g{uj) > 0. We have: 
Proposition 7 The g-function go, given by fSSd . is non-null. 



Proof It suffices to check that P (i^(0) \ ui) > 0. If there exists uj £ X such that P (aj(0) \ui) ~ 0, then, for some 
i £ {1 . . .A''}, TV (j-^j^^^ = or 1. This imposes that either Ci(tj) = ±oo or a-i{Lu) — which is not possible 
since these quantities are bounded by bounds ()16|) . (|17[) . □ 
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3.1.4 go continuous. 

Definition 2 The variation of a g-function g is: 

var-kig) = sup{ 1 g{uj)-g{Lu') \ : u,J £ X,aj(f) = aj'(t), Mt G {-fc, . . . ,0}} . 
Definition 3 A g-function is continuous if vari^{g) — >■ as fc — >■ +oo. 
Proposition 8 go continuous. 
Proof We shall use the following inequalities. 

1. For a collection < a^, 6i < 1, Vi = 1 . . . A'^, we hav«f 

N N N 

\Yla,-Y[b,\<J2W^-b^\, (23) 

i—1 2—1 2—1 

as easily proved by recursion. 

2. For < :r < 1, write \/T^~x = 1 — y^"*"^ fn^"", where fn = / t'>, ^^"'l' ^ > are the series coefficients of 
yJX — X. Then, for A, B real, < w, z; < 1, 

\AVT^-BVT^\ <\A-B\ + J2MAu" -Bv"\ <\A-B\ + J2.fn{\A\u" + \B\v''). (24) 

n—1 71—1 

Fix i G {l,...,iV}. Set yt = '^~f{^f , Vi = ^^^7^^^' C** = C'i(tii).C'i = ^(^i^'). f^i = <^i{yi)."'i = ^^iiU), 
Tj = ''"i(t»j), Tj' = Ti(aj') to alleviate notations in the proof. We have, for fc > 0, 

( N N -\ 

vark{go) lJ]^aj-J]^bij : cj, a;' G A, a;(t) = cj'(t), G {-fc, . . . , 0} > . 

t i=\ i=i J 

where Oj = a;i(0)7r (i/i) + (l ^ ^i{0)) (1 - tt (j/;)), 6^ = tOi{0)Tv (y.Q + (1 - t^i(O)) (l - vr {y'i)). Moreover, since either 
tjj(O) =0 or t<jj(0) = 1, \ai — &i| = |7r(yi) — 7r(y^)|. Therefore, using inequality (|23p . 

AT 

"ci'^fcSo < X^'^^P {''^(^'») ^ 1 • ^ ^''^(*) = (^'{t)yt G {-fc, . . .0}} . 

1=1 

® We thank one reviewer for this useful remark. 
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Fix fc > 0. Then, for all to such that r.i(a;) G { — fc . . . 0}, tt [y-i) = tt {y'^ . Therefore, the sup is realized for those 
aj,tj' such that rj,rj' < — fc. Since 7 < 1 we have therefore < 7*^ and < j-ryr, for r = Ti,T-. We have 

- < \yr - y[\\WU with |k'||oo = Moreover, _ < ^ [ J- - J, [ + 

We have, 

' ^ ' 1 1 



Oi a'. 



<^B 1/1 - 7- 



2ri 



1 - 7"^^' - v/l-7" 



2^1-7^ 

0-3(1 - y^^) 



with 

the last inequality coming from (|24p . 
Likewise, 



+00 



5(7) = E/»^''^""'^ 



C-v/l-7-2-»-Q\/l-7- 



2r,' 



o-b(1 - 7^'") 



+00 



C^-Q| + E/„7''"(|C,'| + |C,|) 



- 7^ ) 



(lQ'^Ql+27''=|C+|S(7)) 



where C^^ is given by H16p . 
We have. 



j=l \l=ri l=Tl J 

-1 -1 



1-7 



7 - 7 



Remark that 



(25) 



l=-k 
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since ijJj{l) = I ~ ~k • ■ • — 1. Therefore, 



Moreover, 



Finally, 



Summarizing, 



l = Ti 1=t' 



27^^ 
1-7' 



I —Ti — T ' I ^ o Mi k 

7 ' - 7 • < 2- 7 

1 — 7 I I 1 — 7 



|C'-C„| < 



N 



<7 



and, 



My^) - AvDW < \l - 



TT 0-5(1 — 72*-') I 1 — 7 



2 ^1^^ 



/AT ^ \ ^ 

E 11^^.1+ El^''l +7'(iVe + ^|G+|)S(7) 



7 



^!,J = l 1=1 

Remarking that ^(7) — >■ /i as fe — > +cx3 -we conclude that varj^gQ — > as fe — > +cx3 and behaves like K^'' , 
where the constant 



N N 

TT OB V 1-7 

l,J = l 1=1 

depends on model parameters (synaptic weights and current) and A'^. □ 



(26) 



Remark. Note that qq is therefore Holderian for the metric ([2]) (with an exponent so it is Lipschitz for 



e = 7). 
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3.2 The Gibbs equilibrium state 

We now prove that the system ((T]) admits a unique invariant probability measure (also called a 5- measure). This 
measure satisfies a variational principle (equilibrium state) and has the form of a Gibbs distribution in statistical 
physics. 

3.2.1 g-measure. 

Definition 4 Let g be a function. A probability measure fi in 'P{X,J^) is a g-measure if : 



Wa £ A and V/ measurable with respect to T^. 

3.2.2 There exists a g-measure for |7P and it is unique. 

Since go is contiimous there is always a g-measure. Now, a theorem of Johansson and Oberg [26] states that if go 
is a continuous, non-null g function on X satisfying: 



Theorem 1 The dynamical system {TP has a unique g^-measure whatever the values of parameters Wij, i,j = 
l...iV,/„ i = l...iV,7,e. 

Proof This follows from the theorem of Johansson and Oberg. Indeed, using a proof similar to prop. [8] and the 
same notations, we have, for fc > 0, and using that 

log K(0)^(i/,) -t- (1 - LJ,{0)) (1 - 7v{y,))] = ^^,(0) log {7v{y,)) + {1 - a;,;(0)) log (1 - 7v{y,)) , 
var k{log go) = 



sup{l Etl h(0) [log(lJ|})] +(l-a;,(0)) [log(i^^)]] ] : c., c^' G X, c.(t) = c.'(f), G {-fc, . . . , 0}} 





(27) 



fe>0 



then the g-measure is unique. 
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N 



< ^ sup 1 1 log 1 + I log (jz^) I e X, .(i) = Jit)yt e {-fc, . . . 0}} 

From the bounds ^6^, (flT)) . 



^ <y»,y» < 



Set a =^ a/1 - 7^ minimi. ..AT and 6 =^ maxi^i...7v < 00. Denote =^ sup^g[^ (,] we 

have: 



warfc(logpo) < 2|| — ^sup {lyi ~ y'i \ : uj,uj' e X,u){t) = io'{t),\/t G {-fc, . . .0}} 



where the norm 

,, TT „ e 2 



"-"-'-/re-*..' 

is finite since h is finite. For the term maxj sup — y[) \ : uj,uj' £ X,u!{t) = uj'{t),\/t G { — fc, . . .0}}, we have the 
same majoration as in the proof of prop [8] Thus, 

uarfe (log po) < K'-y^, 

with 

i^' = ^||^|| i^, (28) 

TT ' ' 

JC' given by (|26|) . It follows that X]fc>o ^'^''fe(loSffo) ^ -^'^ X]fe>o('>'^)'^ ^ Then the po measure is unique. □ 
Let us now characterize the structure of this g-measure. 

3.2.3 The Ruelle-Perron-Frobenius operator. 

For the g-function go define the transfer operator or Ruelle-Perron-Frobenius operator Cgg from C{X, IR) to 
C{X, R), where C{X, R) is the set of continuous real functions on X, by: 



uj' : T{uj')—uj 



(29) 
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Denoting C^g, n > 0, the n-iterates of the RPF operator, -Cgo/ is the conditional expectation of / on the time 
interval [0, n — 1] given the history cj. The Ruelle-Perron-Frobenius is the extension of matrices of probability 
transitions for Markov chains. 

The adjoint of Cg^ maps the set of probability measures on X to itself and is defined by: 

A probability measure ^ on X is a g-measure if and only if Cg^ ^ = /x [31] . 
3.2.4 Equilibrium state. 

Let i/> be a continuous function X — IR such that Yl'kLo '^'^^fc(^) < °° (also called a regular potential [30] )• Call 
Pt{X) the set of T-invariant finite measures on X. For ^ £ Pj^{X) let 

h{fi) = limsup m(Ho) logM(Ho)' (30) 

be the entropy of /i, where the sum holds over all cylinders [lu]q of length n + 1. Note that the entropy can be 
defined in a more general setting (see 30 ). Here we take a definition which corresponds more to the one used in 
neural networks dynamics analysis. 

Definition 5 An equilibrium state, /i^, is a T-invariant measure on X, such that: 

P(7A) ='/i(M^)+/i.0(V')= sup ft(M)+M(^)- (31) 

The quantity P('0) is called the "topological pressure" [51ll4ll46] . This is a fundamental quantity and we come 
back to it in section [J] It is zero whenever the potential tp is normalized, which is the case here since -0 is the log 
of a conditional probability. 

Ledrappier has shown [53] that if -0 is a regular potential then the equilibrium states for tp are the g-measures 
for a continuous g function. In our case where the g-measure is unique, go is related to the potential ^ given by: 

^(uj) = loggo(^)- 
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Therefore, the following holds. 

Theorem 2 Whatever the parameters values the system {ip has a unique go probability measure, fi^, which is 

an equilibrium state for the potential 

N 

= V'(t^-oo) = logSro('^) = X! 

(32) 



3.3 Consequences. 

3.3.1 Asymptotic distribution of membrane potentials 
A first consequence of proposition |4] and theorem [T] is: 

Proposition 9 The membrane potential vector V is stationary with a product density — YliLi Pv i^i) 

where: 

Its expectation is /i^ [C'i(a;)] and its variance /i^ [o"^(<±i)] • 

Comments. This density is a "mixture" of Gaussian densities, but it is not Gaussian. Each Gaussian density 
in the decomposition depends on a specific history yj_, and the integral holds on the set of all possible histories 
with a weight /i^, (t^j). Therefore, to obtain a closed form for the stationary density of membrane potential we need 
to know the invariant probability /i^ which weights the possible histories over an unbounded past. It has therefore 
a highly non trivial structure as announced in section [2.3.31 

3.3.2 Firing rates. 



Define: 



(34) 
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the probability that neuron i fires at time given the past a; and, 



ri =^^^(^^2(0) = 1), 



called "the firing rate" of neuron i. We have: 



(35) 



(36) 



3.3.3 Entropy. 



It results from (ISljl and the normalization of the potential tp that = h{^^) + ('(/'). Therefore: 



AT 



i=l ^ 

Since either uiiiO) = or 1, we have: 



crj(aj) 



) + (l-a;,(0)) log(l-^ 



AT 



cri(cj) 



) + (1 - '"i)^^ log(l - TT 



and finally, 



AT 



(37) 



This looks like the classical entropy for a Bernoulli scheme but with a crucial difference: one has to take the 
expectation of the log of the probability instead of the log of the expectation. 

Moreover ^(oj) < (the strict inequality comes from proposition [7| . Therefore, ^^(^/)) < and: 

Proposition 10 The entropy h{fi^), given by {37] ), is positive whatever the value of parameters Wij,i,j = 
l...N,I„i = l...N,e,y. 

Though this result appears "evident" a priori, it is proved here, as an easy consequence of the variational 
principle pip . Moreover, it provides an explicit value for the entropy, which depends on parameters. 



26 



3.3.4 Gihbs state. 

In the present setting equilibrium states are Gibbs states [5D]. A Gibbs state for the potential ■;/) is a probability 
measure such that one can find some constants P{il'), ci, C2 with < ci < 1 < C2 such that for all n > and 
for all Lu G X: 

1 - exp [-{n + l)P(V) + ELo - ^ ^ 

Basically, the condition (|38|l expresses that the measure of the cylinder [uj]q behaves like: 

which has therefore the classical form of Gibbs distribution where spins chains are replaced by sequences of spiking 
patterns and where the normalization factor Z^^^'^^\u}) is analog to a "partition function" (but depends on uj). 
Note that ~ linirn.+oo i^^r^ ^og Z^^^\u}) , where the limit exists and is constant for /i^-almost-every uj. 

Thus the topological pressure P{tp) is analog to a free energy density. 

3.3.5 Kullback-Leibler divergence. 

Let ^, V be two T-invariant measures. The Kullback-Leibler divergence between /i and v is given by: 



v) = limsup fi {[oj]^) log 



'(Ho) 



(40) 



n— >+oo ^ ^ 

where the sum holds on all possible cylinders [ijj]q. It provides some notion of asymmetric "distance" between fi 
and u. Minimizing this divergence, corresponds to minimizing "what is not explained in the empirical measure /x 
by the theoretical measure . 

The following holds. For fi an ergodic measure and fi^ a Gibbs state with a potential both defined on the 
same set of sequences, one has (4l l45ll30l[T3] : 

d{^i,^i^)=P{i,)^f^{^)~h{f^). (41) 

This result is used in the next section. 
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4 Finite range approximations. 

4.1 Constructing a Markov chain with finite memory. 

The main difSculty in handling the transition probabihties (jlSp and the related equilibrium state is that they 
depend on an history dating back to ri{oj^^), where ri{Lu*_r^) is unbounded. On the other hand, the influence 
of the activity of the network, say at time —I, on the membrane potential Vi at time 0, appearing in the term 
Xij{oj^^) = Y1'i=t {uj° )7~''^j(0i (eq. [12]| decays exponentially fast as I — >■ — oo. Thus, one may argue that 
after a characteristic time depending on | |og(^)| the past network activity has little influence on Vi{0). We now 
make this statement precise, especially evaluating the error attached to this approximation, before exploring its 
consequences in section Ul 

4-1.1 Range-R approximation. 

Assume that we want to approximate the statistics of spikes, given by the dynamics ([T}, by fixing a finite time 
horizon R such that the membrane potential at time depends on the past only up to some finite time —R. In this 
way, we truncate the histories and we approximate the transition probabilities P (a;(0) j ujZI^), with unbounded 
memory, by transition probabilities P (^bj{0) \ uiZ]^) where Tj(aj~^) is replaced by r,j(cj2]j) (see eq. Q), thus 
limiting memory to at most R time steps in the past. These approximated transition probabilities constitute 
therefore a Markov chain with a memory depth R. How good is this approximation ? To answer this question let 
us first construct the Markov chain within more details. 

4.1.2 Blocks coding. 

Since we are now only considering finite histories given by spike blocks of length R, of the form i^Z]ji we may 
encode each of these blocks by an integer 

N -1 

w = J2Yl 2('-i)+("+^)VW. (42) 

2^1 n——R 
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We write w ~ '^„/j- These integers or words constitute the states of the Markov chain. We note 



'^^f {o,...,2^^-i}, 



the set of words. 



4-1.3 Transition matrix. 



From prop. [51 this chain is homogeneous, i.e. transition probabilities does not depend on time. They are encoded 
in a 2^-'^ x 2^''' matrix £(^' with entries: 



^(K) dcf . 



P {uj{0)\uj_]^) , if w' UJ_]^,W Uj'^j^^^, 

(43) 

0, otherwise. 

If w' ~ Lu~^,w ~ we say that w "follows" w' or that the transition w' — > w is "legal". Note that 

when using a matrix representation where w' G O^, w £ f2^ not all transition are legal {w' ,w must correspond 
to overlapping blocks). We therefore use the convention that non-legal transitions have a zero probability. 

It results from prop. [7] that any legal transition has a positive probability. Note that the transition matrix 
corresponds to the matrix representation of the Ruelle-Perron-Frobenius (|29[) operator in the case of a finite 
memory. 

4.1.4 Incidence matrix. 

The set of transitions is encoded in a 2^''^ x 2^''^ matrix I, called incidence matrix with entries: 

1, if w' w is legal 

(44) 

0; otherwise. 

It is easy to show that X is primitive namely 3m > such that yw,w' G fi^ x , X™,^ > 0, where I"* is the 
m-th power of X. Indeed, having I™^, > means that there exists a raster plot uj which contains the block w' 
and the block w where the first spiking pattern of each block is separated by m time steps. Therefore, taking 
m — R + 1 any raster containing the concatenation of blocks w'w satisfies the requirement. 



29 



4-1.5 Range R + 1 potential. 

Using the same representation as (1421) . but with blocks of size _R + 1, to each block i^ljj of length i? + 1 we 
associate a word W = Y.'Li EL-A, 2''"^)+("+^'^a;j(7i) ~ a;° ^ and define: 



^(R)^W)^J2 ^KO)log(^( ;^^^) )+(l-a;,(0))log(l-^' 



(45) 



called a range-R + 1 potential. It corresponds to an approximation of the potential (|32p when the memory depth 



of the chain is R. Then: 
4.1.6 The Perron- Frobemus theorem. 

Since X is primitive and since all legal transitions have a positive probability, is primitive and the Perron- 

Frobenius theorem holds [19II5U) . has a real positive eigenvalue s with maximal modulus, isolated from the 

rest of the spectrum. Moreover, since is a transition probability, s = 1. The quantity 

P{iP^'^'^) = logs = 

is the topological pressure of the potential ip^-^^ [3Uj . 

The corresponding left and right eigenvectors are respectively denoted I and r i.e. I C^^^ — .si and /I^^-' r = sr 
where r as positive entries rw > 0. Without of generality we may assume that {l,r) = 1, where (,) denotes the 
standard scalar product. 

The Markov chain has a unique invariant probability measure ^.^(r) = Ir (i.e Vto G fl^ , ^.^(r){w) — IwTw). 
From this, one can compute the probability of spike blocks of arbitrary length by the Chapman-Kolmogorov 
formula: 

t-l 

M^(H)(H^^) = MV,(«) n ^SV(n+l)' 

n—s 

with w(n) ~ ujn'^^-. 
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Then, one can check that fi^{R) is a Gibbs distribution 00] (see def. I38[). Moreover, it is also an equihbrium 
state in the sense of (|HT||. As a resuh = h{fi^(R-)) + h^ir) (t/)^^-*), therefore 



4.2 Convergence of the approximation. 

Let us now discuss how well the range- 7? potential H45|) approximates the infinite range potential (|32|l . By definition 
of a range-/? -I- 1 potential ip^'^^ (lj) — iP^^\lo^j^) so that we can compare ip and ip^'^K One has: 

IIV' - ^^^^ lloo < sup { I ^{oj) - iPiij') \ ■.Lj,Lj' e X,iv{t) = Lu'{t), G {-R, . . . , 0}} =^ varRiiP), 

so that, from Theorem ((T]), 

U-i''-"^^\\oc.<K''y^, (46) 

where if' is given by (|28|l . Therefore, ■i/'^^-' approaches exponentially fast, as R growths, with a rate 7. This is 
in fact a classical result in ergodic theory: regular potential are approximated by finite-range potential in the sup 
norm where 11-0 — V'''^^l|oo < C0^ , for some < 9 < 1 (see def. [2]|. Here it is natural to take = 7. 

The implications on statistics is related to the KuUback-Leibler divergence (|4U|I . Indeed, since and 
are Gibbs distributions for the right shift T we may use (|41|) . giving: 

= PW - f^4,(R){'4') - HtJ'^w) = (V'*'^^ -^)> 

where we used P(i/') — (normalization of ip) and h{fi^(R)) = — (^'■^'') (see e.g. for a more general 
proof). Therefore, 

<^'7'^- (47) 

Therefore, the KuUback-Leibler divergence between the two measures n^, fJ^^iR) , decays exponentially fast 
with a decay rate 7. 
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A practical consequence of this result is that it might be sufficient, for practical purposes, to approximate tp 
with a potential of range: 

R^-'-2l^. (48) 

log 7 

Note however that the constant K' depend on several parameters. Especially, it diverges when ag — >■ (or 7 — > 1). 
As a consequence, depending on these parameters, the effective range can be quite large. 



5 Raster plots statistics 

As discussed in the introduction, the neuroscience community is confronted to the delicate problem of character- 
izing statistical properties of raster plots from finite time spike trains and/or from finite number of experiments. 
This requires an a priori guess for the probability of raster plots, what we call a statistical model. These models can 
be extrapolated from heuristic arguments or from principles such as the Jaynes argument from statistical physics 
[25] (see section [5. 3. 2 1) . In this section, we show that Markovian approximations introduced in the previous section 
constitute such statistical models, from which classical statistical indicators used by the neuroscience community 
can be explicitly computed in the case of model IT}. 



5.1 Two representations of the potential tp^ ' . 
5.1.1 spike-block representation. 

The potential i/;^^-' is a function of VK ~ '-^^r- Therefore, it takes only L — 2^(^+^) values, explicitly given by 
1145^ ■ To each possible block oj'^r we associate a word Wn,n = 1 . . . L. Call Xn{W) the characteristic function, 
equal to 1, if W = Wn, and otherwise. Then: 

^(R)(^W)^ E C,nXniW)=i>i''\w), (49) 
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where an ~ V' (W'n)- This decomposition of the potential is called the spike-block representation of ' and the 
index ct in (|49p (which is the vector {an)^—i) makes this representation explicit. Note that ct depend (analytically) 
on the model-parameters Wij,i,j = l...N,Ii,i = l...N,"/,d. 

5.1.2 Interpretation 

This representation is quite natural since e"" is nothing but the probability P(cj(0) | oj^jj) with Wn ~ ^^r^ 
namely a matrix element of the transition matrix (|43|l . This function corresponds to the so-called "conditional 
intensity" introduced in neuroscience data analysis by researchers like C. Pouzat and A. Chaffiol [40 and the 
exponential distribution introduced by these authors is actually our Gibbs distribution. Here, we are able to 
compute explicitly this distribution, because we are dealing with a model, while Pouzat and Chaffiol are coping 
with real world data. 

Fixing an history ^Z.\i the sum of e""'s, over all blocks Wn having an history i^Z/j a-nd such that tUj(O) — 1, 
is the probability that neuron i fires given the history More generally, the product of the transition matrix 

C elements (143 |l provides the probability of a certain sequence of spikes ("response") given a certain history. If 
one focuses on the response 7?. of a subset of neurons in the network, to spikes emitted by an another subset of 
neurons in the network-corresponding to a given history and considered as a stimulus S- the matrix C in the 
a-representation allows the computation of the probability P{TZ\S). Then, by Bayesian inference, and stnce the 
probability P{S) of the stimulus is known (it is given by the invariant measure of the Markov chain), one infers 
P{S j TV). This provides one way of characterizing the "neural code", in the sense of ^2], at the level of networks 
of neurons, where stimuli are spike trains. 

5.2 The spikes-uplets representation. 

Though natural the a-representation is not the most commonly used. Let us introduce another representation. 
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5.2.1 Monomials. 

An order-n monomial is a product tjjj(ti) . . .ci;j^(tn), where 1 < ii < 12 < • ■ • < in < A'^ and —00 < ti < f2 < 
■ ■ ■ < tn < +00, and where there is no repeated pair of indexes {i, t). Since u)i {t)^ =bji{t)yi = i...N,te > 1 
the last requirement avoids redundancies. A polynomial is a linear combination of monomials. 

The monomial (ti) . . . (tn) takes values in {0, 1} and is 1 if and only if each neuron «/ fires at times ti, 
I = 1 . . . n. On phenomenological grounds this corresponds to a spike n-uplet (ii, ti), . . . , (in,tn) (neuron ii fires 
at time ti, and neuron 12 fires at time t2, . . .). 

5.2.2 Spikes-uplets expansion ofip. 

Returning to the spike-block representation ((49]), the characteristic function of the word Wn, Xn{W), reads: 

Xn{W)=QniW)Rn{W), 

with 

QniW) = H u;l(t), 
(i,t),Wi(t) = l 

where W represents a spike block u'^^ji, while the product holds over all pairs (i,t), 1 < i < N , —R < t < such 
that ijJi{t) = 1, in the word Wn ~ i^-_r- Likewise, 

Rn{W)= W (f-a;;-(s)) = ^(-l)"i?„,™(VK), 
0-,s),Wj(s)=0 m=l 

where Rn,m are monomials of order < 7?. Since uji {tf = UJ,{t), > 1, Qn, Rn are monomials of order < R and 

all Xn{Wys are polynomials of order < R. 

For N,R £ Z, we note 'P{N,R) the set of non repeated pairs of integers (i, n) with i G {1,...,A} and 

t G {—R, . . . , 0}. We have just proved that ^ is approximated by a range- 7? polynomial expansion of the form: 

R 

V'f^W^E E (50) 

1=0 (ii,ti),...,(iiM)eViN,R), 
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where W ~ ojg . This is called the "spike- uplets representation". It is obviously equivalent to the spike- block 
representation '(/'cf^^ (the A^'s are linear combinations of the aj's) but this expansion is more convenient to discuss 
the link between our results and the standard approaches used in the neuroscience community. Note that a 
spike- block contains O's and I's (it tells us which neurons are firing and which neurons are not firing) while a 
spike-uplet takes only into account firing neurons. As a consequence there is some redundancy in the spike-block 
representation that can be removed in the spike-uplets representation (see details below). 



5.2.3 Interpretation. 

Since the analytic function log(7r(a;)) has a series expansion for a:: G IR, setting x — — f ^ in (I45|l x" 
is a sum of terms LUi-^{ti) . . .LOi^{tn) and using the series expansior|j one can compute explicitly the coefficients 
of the spike-uplets expansion. Due to stationarity (see details below) one can only consider spike-uplets of the form 
uji (O)ctJjj (t 1 ) . . . u]j^ (ti ) , with ti, . . . ,ti < 0. They are combinations of terms proportional to Wij-^ Wij2 ■ ■ ■ ^iji 7~ ^ 
which have a nice interpretation. The sum of these terms corresponds to the cumulative effect of spikes emitted 
by neurons ji, . . . ,ji at times ti, . . . ,ti in the past, on neuron i at time 0. Actually, these terms are related to a 
linear response theory as developed in a different context in [9lll0j. 



5.3 Statistical models. 

5.3.1 The topological pressure as a cumulant generator. 

Let us return to the transition matrix and related topological pressure P(-i/)'^^) introduced in sections 

14.1. 314. l"!6] These quantities depend on a or A according to the representation (which is nothing but a change of 
variables) . 

' bince idi (t)* = uji{t), Vi = 1 . . . Af, t e > 1, the terms of the series can be grouped together giving rise to a finite 
sum of monomials. 
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The pressure is differentiable with respect to A and one has[_: 

— (J) = /^^(H) Ni (ii) • • • ^ii (*;)] , 

(ii,ti),...,(i,,ii) 

Therefore, the derivation of the pressure with respect to the quantity a|''' ^ ^ ^ ^ provides the -probability 
of the spike n-uplet Ui-^ (ti) . . . uji^ (ti). 
In particular: 



dX 



(1) 



M^(H) Ki(*i)] . (51) 



n,ti 

the firing rate of neuron ii at time ti. Since dynamics is stationary this quantity does not depend on t (see eq. 
(|35|l ). As a consequence all terms Xi^t, i fixed and t £ { — R, • ■ • , 0} play the same role and we can simplify the 
potential (|50p in keeping, as first order terms, the monomials of form X^^^ uJi{G), i = 1 ... TV. 
In the same way: 

(2) = Ki(*l)^»2(*2)] , 

(il,tl),(i2,t2) 

From stationarity it follows that this quantity depends only on t2 — ii. So, there are redundant terms in the expan- 
sion (|50p and we may write the part of the expansion corresponding to pairs of spikes as X^f"— -/(•/2 X^i^ 22 = 1 '^i^^i2 t'^'i ('^)'^i2 (''') 

Higher order redundant terms can be removed as well, taking into account the stationarity of the process. As 
a consequence we may write the spike-uplets expansion of in the form: 

K 

^'^x\w) = Y.^i<t>iiW), (52) 
1=0 

where I enumerates all non redundant monomials of order < R, including the constant monomial (j)Q(W) — 1. 



* Here the pressure P{ip^^'^) is considered as a function of the A's, where these parameters are arbitrary. As a consequence 
the corresponding potential i/'lj^' is no longer normalized. In this case the topological pressure is defined as the logarithm 
of the maximal eigenvalue of the matrix C^^\ The fact that a potential of the form I I50II arc in general not normalized has 
deep practical consequences widely discussed in the paper 1591 . 
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5.3.2 Statistics of raster plots from Jaynes formalism 

We would like now to relate the present analysis to a standard problem in spike train analysis. Assume that 
we have generated a (finite) raster plot uuexp from the dynamical system ^ and that we want to recover the 
probability distribution from this raster plot, without any other information. A usual approach consist of 
computing the average value of some prescribed spike-uplets, and infer the corresponding probability distribution 
from a variational principle introduced by Jaynes [25]. The Jaynes approach has been used by several authors in 
the field of experimental spike train analysis [4911551155] . 

So, let us assume that, from the raster plot LJexp, we have computed, by time averag(f , the average value 
C; of an a priori fixed set of monomials (fn, I = 1 . . . M. To find a probability distribution Mi/it^at which matches 
these average values, without making additional assumptions, one maximizes the statistical entropy under the 
constraints M^j^^j (0;) = Ci, I = 1 . . . M. In the context of thermodynamic formalism this amounts to finding a set 
of parameters A; satisfying the variational equation \31\) for a finite range potential iptest ~ Xvf=i ^l4'l- The A;'s 
are adjustable Lagrange multipliers, which have to be tuned using (|51|l . so that the average of 0; with respect to 
Mutest equal to C; 

Let us give two classical examples. 

5.3.3 Homogeneous Bernoulli statistics. 

The simplest example consists of only measuring, thus, constraining the value of firing rates. This amount to 
considering a range- 1 potential |42II17I[20] . : 

JV 
i=l 

corresponding to a probability: 



/^Vte.t('-'(0)) = n 70) 



^ This argument extends to the case where several empirical raster plots have been generated. Then, average values are 
obtained by combinations of time average and sample average. 
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Therefore, taking only the first order monomials allows one to select a probability distribution under which neurons 
fire independently with a time-independent rate: 



5.3.4 Pairwise interactions. 

This statistical model has been introduced, in the context of spike trains statistics, by Schneidman et al |49) . 
Here, still R — 1 and tptest does not depend on t but pairs of spikes occurring at the same time are considered. 
Then, 

N 

i=l l<ii<i2<N 

Here, the related probability measure does not factorize any more but all information about spike train 
statistics is contained in the first and second order spike-uplets. 

5.3.5 Choosing an a priori set of monomials. 

More general potentials can be considered as well 36 . In view of the present analysis, fixing an a priori set of 
observables, often fixed from a priori hypotheses on the relative role of spike-uplets (e.g. rate versus synchro- 
nization), amounts to fixing a test potential tptest. Therefore, there are as many models as possible choices of 
observables. How to discriminate them ? 

The probability distribution M^^^^j is the Gibbs distribution for the potential '4^test ■ It provides an approxima- 
tion of the invariant measure /i^ in two senses. First, i/'test contains only some terms in the polynomial expansion 
of a finite range potential i/)^^), which are fixed from the a priori choice of observables. Second, t/j^^-' is a finite- 
range approximation of the exact, infinite-range, potential tp. The "error" is measured by the KuUback-Leibler 
divergence (^0)): 
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It is upper-bounded by C7 . This fixes, for model ([T}, an estimate for the value of R given by (|48|l . 

Nevertheless, the number of terms increases exponentially with R and N and therefore, especially if 7 is close 
to 1 there is an overwhelming number of monomials. Now, it might be, that some terms 0; are less important 
than others: the corresponding coefficient A; vanishes or is small compared to others terms. As discussed in [S] 
and shortly commented in section 16.31 neural mechanisms such as plasticity certainly reinforce some of these 
terms (especially rates and spike pairs). On a more abstract setting, our analysis shows that the KuUback-Leibler 
divergence (|40p gives an indication of the distance between the probability reconstructed from Jaynes principle, 
with a "guess" potential, and the true probability . This opens up a way to compare and select statistical models 
by minimizing the KuUback-Leibler divergence, using eq. (|40|l . This aspect and its numerical implementation are 
discussed in |59) . 

6 Discussion and conclusion 

In this paper we have addressed the question of characterizing the spike train statistics of a network of LIF 
neurons with noise, in the stationary case, with two aims. Firstly, to obtain analytic and rigorous results allowing 
the characterization of the process of spike generations. For this, we have used the realm of ergodic theory and 
thermodynamic formalism, which looks well adapted for this purpose. We have obtained unexpected results, 
especially the fact that, even so simple models of neural networks have, strictly speaking, an unbounded memory 
rendering spike train statistics non-Markovian. The common wisdom in the field of neural networks dynamics 
suggests, however, that there is a characteristic time scale after which the system essentially looses its memory. 
Here, this time scale is controlled by 7, the leak rate, closely related to synaptic response time. 

The second goal was to make a connection from this mathematical analysis toward the empirical methods used 
in neuroscience community for the analysis of spike trains. Here, we have shown that the Jaynes method, based 
on an a priori choice of a "guess" potential, with finite range, amounts to approximate the exact probability dis- 
tribution by the Gibbs distribution of a Markov chain [16]. The degree of approximation can be controlled by the 
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KuUback-Leibler divergence which can computed using a classical result in the thermodynamic formalism. This 
analysis opens up the possibility of developing efficients algorithms to estimate at best the statistic of spike trains 
from experimental data, using several guess potential and selecting the one which minimizes the KL divergence [59) . 

Clearly, this work is just a beginning, since, especially, it deals with a rather simple model. Let us now briefly 
comment several possible extensions. 

6.1 Conductance based Integrate-and-Fire neurons. 

A natural extension of the present works concerns the so-called Generalized Integrate-and-Fire models [44] , 
which are closer to biology [271128] . The occurrence of a post-synaptic potential on synapse j, at time results 
in a change of membrane potential. In conductance based models this change is integrated in the adaptation 
of conductances. It has been shown in [TT] that, under natural assumptions on spike-time precision that the 
continuous-time evolution of these equations reduces to the discrete time dynamics: 

V^{t+^)='y^it,Lu'_^)[l-Z{V,mV^it)+Mt,UJtoo), i=l...N, 

where: 

is the integral of the conductance gi{s,u!^^) over the time interval [t,t + 1[. Conductances depend on the past 
spikes via the relation: 

In this equation, Mj{u)*^^) is the number of times neuron j has fired at time t (it can be infinite), a is the 
synaptic profile (it decays exponentially fast) and tj"' is the time of occurrence of the n-th spike in the raster ui. 
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Gij is a positive constant proportional to tiie synaptic efficacy 

= E+G,j if j G £, 
W^j=E-G,j if j€l, 

where Ej^, , E~ are respectively the Nernst potentials for the leak, the excitatory (set £) and the inhibitory 
synapses (set X) . 
The term, 



t+i 



)Vi{s,t+ l,tj_oo)ds, 



is the corresponding integrated synaptic current with: 



ii{t, uj-ao) = — + E^ ^gij{t,u*_oo) + E ^ gij{t, uj*-oo) + i'f'^*\t), 



The difficulty here is that the coefficient 7i(f, cj^^^), which is the analog of 7 in eq. ((T| depends on the whole 
past. This introduces another non-Markovian effect in the dynamics. In this case the computation of the potential 
corresponding to (|32p is clearly more complex. This case is under current investigations. 



6.2 Non stationarity. 

One weakness of the present work is that it only considers stationary dynamics, where e.g. the external current 
li is independent of time. Besides, we have taken the limit s — >■ —00 in section [2] to remove the dependence in 
the initial condition V{s). However, real neural systems are submitted to non static stimuli, and transients play 
a crucial role. To extend the present analysis to these case one needs the proper mathematical framework. The 
non stationarity requires to handle time dependent Gibbs measures. In the realm of ergodic theory applied to non 
equilibrium statistical physics, Ruelle has introduced the notion of time-dependent SRB measure [47] . A similar 
approach could be used here, at least formally. 
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Handling the transients is an even more tricky question. The main difficulty is to propose a probability 
distribution for the initial condition V{s). From the dynamical systems point of view it is natural to take e.g. 
Lebesgue, and extensions toward this case are under current investigations. But if one wants to make serious 
extrapolations of mathematical results towards neuroscience one has to ask why the "initial state" of a neural 
network, namely the state in which the neural network is as the experiment starts, should be uniform in the 
phase space (or Gaussian or whatsoever), as soon as this initial state is the result of a previous (phylogenetic and 
ontogenetic) evolution ? 

6.3 Synaptic plasticity. 

In neural networks, synaptic weights are not fixed, as in ([l]), but they evolve with the activity of the pre- and post- 
synaptic neuron (synaptic plasticity). This means that synaptic weights evolve according to spike train statistics, 
while spike train statistics is constrained by synaptic weights. This interwoven evolution has been considered in 
[8] under the assumption that spike-train statistics is characterized by a Gibbs distribution. Actually, the present 
work confirms this hypothesis in the case of LIF models. The main conclusion of [Sj is that synaptic mechanism 
occurring on a time scale which is slow compared to neural dynamics are associated with a variational principle. 
There is a function, closely related to the topological pressure, which decreases when the synaptic adaptation 
process takes place. Moreover, the synaptic adaptation has the effect of reinforcing specific terms in the potential, 
directly related to the form of the synaptic plasticity mechanism. The interest of this result is that it provides an 
a priori guess of the relevant terms in the potential expansion. A contrario, it allows one to constrain the spike 
train statistics of a LIF model, using synaptic plasticity with an appropriate rule which can be determined from 
the form of the expected potential. 

Finally, an interesting issue fitting together with the discussion of non stationarity and synaptic plasticity, is 
to analyse spike frequency adaptation in this context [15lll8ll2l l3] 
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